An interactive tutorial on text-to-speech synthesis from diphones in time domain

Authors

  • Rüdiger Hoffmann
  • Bettina Ketzmerick
  • Ulrich Kordon
  • Steffen Kürbis
Abstract

We present an interactive course on speech synthesis designed to support education in speech communication. The basic section explains the fundamental principles of speech synthesis. To explore a complete text-to-speech (TTS) system, the user is given access to the Dresden Speech Synthesizer DreSS. The user may type any text and observe how the system processes it, from the initial linguistic preprocessing through to the acoustic synthesis. A further section is devoted to the crucial problem of correctly segmenting the speech elements used for concatenative synthesis. The user may select their own diphone segments from a given speech database. The quality of the segments can be evaluated acoustically, and hints are given for avoiding cutting errors. The user thus learns how to select good-quality segments. The course is written in HTML and Java and is designed for use over the Internet.
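As a rough illustration of the concatenative idea the abstract refers to, the following Python sketch turns a phone sequence into diphones and joins stored waveform segments in the time domain. It is not the DreSS implementation, whose internals are not described here. The function names, the `_` silence marker, the toy inventory, and the linear crossfade are all assumptions made for illustration; practical time-domain synthesizers usually rely on pitch-synchronous joining rather than a plain crossfade.

```python
from typing import Dict, List, Tuple

Diphone = Tuple[str, str]


def to_diphones(phones: List[str], boundary: str = "_") -> List[Diphone]:
    """Pair each phone with its successor, padding both ends with a
    silence marker (the "_" symbol is a hypothetical convention)."""
    padded = [boundary] + phones + [boundary]
    return list(zip(padded, padded[1:]))


def crossfade_concat(segments: List[List[float]], overlap: int) -> List[float]:
    """Join waveform segments in the time domain, blending up to
    `overlap` samples at each joint with a linear crossfade."""
    out: List[float] = []
    for seg in segments:
        n = min(overlap, len(out), len(seg))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[start + i] = (1 - w) * out[start + i] + w * seg[i]
        out.extend(seg[n:])
    return out


def synthesize(phones: List[str],
               inventory: Dict[Diphone, List[float]],
               overlap: int = 2) -> List[float]:
    """Look up every diphone in a (hypothetical) inventory and join them."""
    units = to_diphones(phones)
    missing = [u for u in units if u not in inventory]
    if missing:
        raise KeyError(f"diphones missing from inventory: {missing}")
    return crossfade_concat([inventory[u] for u in units], overlap)
```

Calling `to_diphones(["k", "a"])` yields the three units `("_", "k")`, `("k", "a")`, and `("a", "_")`. A badly cut segment shows up as a discontinuity at the joint, which is why the choice of cutting points matters for the quality of the result.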


Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...


Halfphones: A Backoff Mechanism for Diphone Unit Selection Synthesis

Diphone Backoff mechanisms in text-to-speech provide a means of ensuring that synthesis of the text takes place, even if some of the diphones in the text are missing in the speech database. This paper describes an automatic method for synthetically creating missing diphones from halfphones that are in the speech database.


Design and evaluation of prosodically-sensitive concatenative units for a Korean TTS system

This paper describes the design and evaluation of prosodically-sensitive concatenative units for a Korean text-to-speech (TTS) synthesis system. The diphones used are prosodically conditioned in the sense that a single conventional diphone is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences. The four levels of the Korean...


A database design for a TTS synthesis system using lexical diphones

Database designs based on the premise that English has about 2000 diphones, as stated in many publications and online documents, are likely to yield a diphone database that fails to capture some important phonological phenomena of English. This paper proposes a TTS database built from diphones that include their syllabic stress; we term these units lexical dip...


Extraction of Di-phones for Telugu: Issues and solutions

This paper describes a method for extracting diphones to build a diphone database for concatenative text-to-speech systems. A diphone is an adjacent pair of phones. Diphones are a very important resource for both text-to-speech (TTS) and speech-to-text (STT). Consider the pronunciation of -kaaki. It consists of phonemes [k], అ [a], అ [a], [k], ఇ[i]. The diphones generated while pronouncing the ...




Publication date: 1999